2024 iThome Ironman Contest

DAY 27

Software Development

Crab Kindergarten: A Beginner's Guide to Rust, Part 27

Day 27 - Closures

16th Ironman Contest

blueye

2024-10-11 01:41:24


In Rust, functions are first-class citizens: a function can be passed as an argument to another function or returned from one, so Rust supports a functional programming style.

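Closure definition and basic usage

One of the most commonly used of these features is the closure. A Rust closure is an anonymous function that can be assigned to a variable or passed as an argument to other functions. The biggest difference from a function is that a closure can capture variables from the scope in which it is defined.

Here is a simple example of a closure: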

fn main() {
    let x = 5;
    let add_x = |y| x + y; // the closure captures the outer variable x
    println!("{}", add_x(7)); // 12
}

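The pair of vertical bars || delimits the closure's parameters, which here is just y. Like a function, a closure can take several parameters and can annotate its parameter and return types, but in most cases the compiler infers them and no annotation is needed. This is another major difference between closures and functions.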

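For simple logic, a single expression can follow || directly and becomes the closure's return value, with no braces needed. Multi-line logic must be wrapped in braces, just like an ordinary function body.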
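As a small sketch of the forms just described, the same kind of closure can be written with inferred types, with explicit annotations, or with a braced multi-line body. The names closure_forms, add_offset, add_typed and describe are made up for illustration.

```rust
// Hypothetical helper so the closures can be exercised outside main.
fn closure_forms() -> (i32, i32, String) {
    let offset = 10;

    // Parameter type inferred from the call below; a single expression needs no braces.
    let add_offset = |n| offset + n;

    // Parameter and return types written out; an explicit return type requires braces.
    let add_typed = |a: i32, b: i32| -> i32 { a + b };

    // Multi-line logic needs braces; the last expression is the return value.
    let describe = |n: i32| {
        let kind = if n % 2 == 0 { "even" } else { "odd" };
        format!("{n} is {kind}")
    };

    (add_offset(3), add_typed(3, 4), describe(7))
}

fn main() {
    let (a, b, c) = closure_forms();
    println!("{a}, {b}, {c}"); // 13, 7, 7 is odd
}
```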

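Notice also that although x is not a parameter, the expression can use x directly. This is the variable capture mentioned earlier. It is an important property of closures and often makes the code more concise than the equivalent written with functions.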

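Where functions and closures differ most in practice is this: a function is an interface explicitly exposed to its users, so its inputs and outputs must be spelled out for callers to know how to use it. A closure is usually local or used only by its author, and with the scope and context of use restricted, the compiler can infer most of the types.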

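Although closure parameters can usually omit their type annotations, this only works when the compiler has something to infer from, typically a call to the closure. The compiler infers the parameter type from the value actually passed at the call site, so without one it cannot decide what the type is and reports an error. Try commenting out the println! on the last line and you get the following error: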

error[E0284]: type annotations needed
 --> src/main.rs:3:18
  |
3 |     let add_x = |y| x + y; // the closure captures the outer variable x
  |                  ^    - type must be known at this point
  |
  = note: cannot satisfy `<i32 as Add<_>>::Output == _`
help: consider giving this closure parameter an explicit type
  |
3 |     let add_x = |y: /* Type */| x + y; // the closure captures the outer variable x
  |                   ++++++++++++

For more information about this error, try `rustc --explain E0284`.

Closures and type inference

Rust closures are also not generic: each parameter's type is fixed by the first use. For example:

fn main() {
    let print = |x| println!("input is {}", x);
    print("hello");
    print(1);
}

error[E0308]: mismatched types
 --> src/main.rs:4:11
  |
4 |     print(1);
  |     ----- ^ expected `&str`, found integer
  |     |
  |     arguments to this function are incorrect
  |
note: expected because the closure was earlier called with an argument of type `&'static str`
 --> src/main.rs:3:11
  |
3 |     print("hello");
  |     ----- ^^^^^^^ expected because this argument is of type `&'static str`
  |     |
  |     in this closure call
note: closure parameter defined here
 --> src/main.rs:2:18
  |
2 |     let print = |x| println!("input is {}", x);
  |                  ^

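Try commenting out one print call or the other and compare. The closure body could handle either type, but the parameter type is fixed at the first use, so a later call with a different type is an error.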
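If one callable really does have to accept several types, a generic function can take the closure's place. This is a minimal sketch, assuming a hypothetical helper named describe that returns the formatted text instead of printing it so the result is easy to check.

```rust
use std::fmt::Display;

// A generic function is instantiated separately for each argument type,
// unlike a closure, whose parameter type is fixed by its first use.
fn describe<T: Display>(x: T) -> String {
    format!("input is {}", x)
}

fn main() {
    println!("{}", describe("hello")); // input is hello
    println!("{}", describe(1)); // input is 1
}
```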

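How closures capture values

Closures capture values from the environment in three ways, which correspond to the three ways a function can take a parameter: borrowing immutably, borrowing mutably, and taking ownership. The closure picks one based on what its body does with the captured value.

Immutable borrow: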

fn main() {
    let list = vec![1, 2, 3];
    let immutable_borrow = || println!("in closure {:?}", list);
    println!("before closure borrow {:?}", list);
    immutable_borrow();
    println!("after closure borrow {:?}", list);
}

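Once the closure is assigned to a variable, it holds an immutable reference to list, and that borrow lasts from the closure's definition to its last use. Several immutable references can coexist, though, so the overlapping borrows here are not a problem.

Mutable borrow: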

fn main() {
    let mut list = vec![1, 2, 3];
    let mut mutable_borrow = || list.push(4); // mutable borrow
    println!("before closure borrow {:?}", list); // immutable borrow
    mutable_borrow();
    println!("after closure borrow {:?}", list);
}

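list.push modifies the original list, so the closure needs a mutable borrow. I deliberately placed an immutable borrow between the closure's definition and its call, so an immutable and a mutable borrow are live at the same time and the compiler reports an error. Removing the before line fixes it.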

error[E0502]: cannot borrow `list` as immutable because it is also borrowed as mutable
  --> src/main.rs:12:44
   |
11 |     let mut mutable_borrow = || list.push(4);
   |                              -- ---- first borrow occurs due to use of `list` in closure
   |                              |
   |                              mutable borrow occurs here
12 |     println!("before closure borrow {:?}", list);
   |                                            ^^^^ immutable borrow occurs here
13 |     mutable_borrow();
   |     -------------- mutable borrow later used here
   |
   = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)
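For comparison, here is a sketch of a version that should compile: the before line is removed, so list is not read again until after the last call to the closure, by which point the mutable borrow has ended. The wrapper function push_with_closure is made up for this example.

```rust
// Hypothetical wrapper around the example so the result can be returned.
fn push_with_closure() -> Vec<i32> {
    let mut list = vec![1, 2, 3];
    let mut mutable_borrow = || list.push(4); // mutable borrow starts here
    mutable_borrow(); // last use of the closure, so the borrow ends here
    println!("after closure borrow {:?}", list); // reading list is allowed again
    list
}

fn main() {
    let list = push_with_closure();
    println!("{:?}", list); // [1, 2, 3, 4]
}
```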

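Another interesting point is that mutable_borrow itself must be declared mut for the code to compile. This is mandatory.
The variable is never reassigned, but the closure it holds modifies outside state (list), so calling it requires a mutable borrow of the closure itself, and the binding therefore has to be mut for the borrow checker to accept the call.

Without mut, you get the following error: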

error[E0596]: cannot borrow `mutable_borrow` as mutable, as it is not declared as mutable
 --> src/main.rs:4:5
  |
3 |     let mutable_borrow = || list.push(4);
  |                             ---- calling `mutable_borrow` requires mutable binding due to mutable borrow of `list`
4 |     mutable_borrow();
  |     ^^^^^^^^^^^^^^ cannot borrow as mutable
  |
help: consider changing this to be mutable
  |
3 |     let mut mutable_borrow = || list.push(4);
  |         +++

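Taking ownership

Next, taking ownership. There are two cases: the value comes from the environment, or it is passed in as a parameter.

From the environment: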

fn main() {
    let s = String::from("hello");
    let get_modified_str = || {
        let mut w = s; // ownership of the captured s moves here
        w.push_str(", world!");
        w
    };

    // println!("s: {s}"); // s has been moved into the closure and can no longer be used

    let y = get_modified_str(); // the closure actually runs here
    println!("y: {y}");
}

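Code that takes ownership behaves the same way inside a closure. Because the body moves s, the closure captures s by value at the point where it is created, so the original variable is unusable even before the closure has run.

Passed in as a parameter: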

fn main() {
    let s = String::from("hello");
    let get_modified_str = |mut z: String| {
        z.push_str(", world!");
        z
    };

    // println!("s: {s}");

    let y = get_modified_str(s);
    println!("y: {y}");
}

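In this case the closure parameter needs a type annotation. The body calls a method on z before the compiler has seen any call site, so it cannot tell whether we want a mutable reference (z: &mut String) or a transfer of ownership (mut z: String). The two are used differently and give different results, and rather than guess the compiler asks us to write the type explicitly.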
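To make the difference concrete, here is a sketch with both annotations side by side. The names annotated_forms, append_in_place and append_owned are illustrative only.

```rust
// Hypothetical helper returning the results of both closure styles.
fn annotated_forms() -> (String, String) {
    // Takes a mutable reference: the caller keeps ownership and sees the change.
    let append_in_place = |z: &mut String| z.push_str(", world!");

    // Takes ownership: the argument is moved in and the result is handed back.
    let append_owned = |mut z: String| {
        z.push_str(", world!");
        z
    };

    let mut a = String::from("hello");
    append_in_place(&mut a); // a is still usable afterwards

    let b = String::from("hello");
    let c = append_owned(b); // b has been moved and cannot be used afterwards

    (a, c)
}

fn main() {
    let (a, c) = annotated_forms();
    println!("{a} / {c}"); // hello, world! / hello, world!
}
```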

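Forcing a move of ownership

Sometimes we want to force ownership into the closure. The move keyword does this: ownership of the captured variables moves into the closure, and the originals can no longer be used (for types that are not Copy).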

fn main() {
    let mut list = vec![1, 2, 3];
    let mut move_ownership = move || list.push(4);
    move_ownership();
}

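move is typically used in a few situations:

When the original variable is no longer needed after the closure runs
When the closure is stored and run later, as in deferred computation
When sending data across threads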
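The stored-and-run-later case can be sketched as a function that builds a closure and returns it. The name make_adder is an assumption for this example; x lives only inside make_adder, so the returned closure has to own its copy, which is what move provides.

```rust
// Hypothetical factory: without move the closure would borrow x,
// which no longer exists once make_adder returns.
fn make_adder(x: i32) -> impl Fn(i32) -> i32 {
    move |y| x + y
}

fn main() {
    let add_five = make_adder(5); // the closure is created and stored here
    println!("{}", add_five(7)); // and run later: prints 12
}
```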

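The thread case is the interesting one. If ownership were not moved, the new thread would have no way of knowing what has happened to the original variable in the main thread by the time it actually runs. If that scope had already ended and the memory been freed, the new thread would be left holding a dangling reference.
The general rule is to make sure a value is still valid at the moment the closure uses it.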

use std::thread;
use std::time::Duration;

fn main() {
    let mut handles = vec![];

    for i in 0..5 {
        let i = i;
        let handle = thread::spawn(move || {
            println!("Task {} is running", i);
            thread::sleep(Duration::from_secs(1));
            println!("Task {} end", i);
        });
        handles.push(handle);
    }

    println!("Main thread continues...");

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Main thread end");
}

The code above sends data across threads. If move is left off the closure, the compiler reports an error, so there is no risk of forgetting it:

error[E0373]: closure may outlive the current function, but it borrows `i`, which is owned by the current function
  --> src/main.rs:9:36
   |
9  |         let handle = thread::spawn(|| {
   |                                    ^^ may outlive borrowed value `i`
10 |             println!("Task {} is running", i);
   |                                            - `i` is borrowed here
   |
note: function requires argument type to outlive `'static`
  --> src/main.rs:9:22
   |
9  |           let handle = thread::spawn(|| {
   |  ______________________^
10 | |             println!("Task {} is running", i);
11 | |             thread::sleep(Duration::from_secs(1));
12 | |             println!("Task {} end", i);
13 | |         });
   | |__________^
help: to force the closure to take ownership of `i` (and any other referenced variables), use the `move` keyword
   |
9  |         let handle = thread::spawn(move || {
   |                                    ++++
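The loop above moves an i32, which is Copy. As a sketch of the same idea with owned heap data, the hypothetical helper sum_in_thread below moves a whole Vec into the new thread and gets the closure's result back through join.

```rust
use std::thread;

// Hypothetical helper: sums a vector on another thread.
fn sum_in_thread(data: Vec<i32>) -> i32 {
    // move transfers ownership of data to the new thread,
    // so the data stays valid for as long as that thread needs it.
    let handle = thread::spawn(move || data.iter().sum::<i32>());
    // join waits for the thread and hands back the closure's return value.
    handle.join().unwrap()
}

fn main() {
    println!("{}", sum_in_thread(vec![1, 2, 3])); // 6
}
```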

Closure traits

Depending on what a closure does with its captured variables, Rust defines three traits for closures:

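FnOnce: a closure that can be called once. A closure that consumes or moves captured values out of its body implements only FnOnce and cannot be called again.
FnMut: a closure that can be called multiple times and may modify captured variables.
Fn: a closure that can be called multiple times and does not modify captured variables, or that captures nothing from the environment at all. Typical for read-only data or pure functions.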

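These traits are implemented automatically based on what the closure body does. Every closure implements at least FnOnce, and many implement more than one.
An ordinary function passed as an argument to another function can be used like a closure; ordinary safe functions implement all three traits.
This matters when a function uses trait bounds to restrict what kind of callable it accepts.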
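As a rough illustration of how the bounds line up with closure bodies, the sketch below defines one helper per trait. All names here (call_once, call_twice_mut, call_twice, double, trait_demo) are invented for the example.

```rust
// Hypothetical helpers, one per trait bound.
fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn call_twice_mut<F: FnMut() -> i32>(mut f: F) -> i32 {
    f();
    f()
}

fn call_twice<F: Fn(i32) -> i32>(f: F) -> i32 {
    f(1) + f(2)
}

fn double(n: i32) -> i32 {
    n * 2
}

fn trait_demo() -> (String, i32, i32, i32) {
    let s = String::from("hello");
    // The body moves s out, so this closure implements only FnOnce.
    let consumed = call_once(move || s);

    let mut count = 0;
    // The body mutates captured state, so this closure is FnMut but not Fn.
    let counted = call_twice_mut(|| {
        count += 1;
        count
    });

    let offset = 10;
    // The body only reads captured state, so this closure implements all three traits.
    let from_closure = call_twice(|n| n + offset);
    // A plain function also satisfies the Fn bound.
    let from_fn = call_twice(double);

    (consumed, counted, from_closure, from_fn)
}

fn main() {
    println!("{:?}", trait_demo()); // ("hello", 2, 23, 6)
}
```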

For example, unwrap_or_else, one of the methods on Option, uses such a trait bound:

impl<T> Option<T> {
    pub fn unwrap_or_else<F>(self, f: F) -> T
    where
        F: FnOnce() -> T
    {
        match self {
            Some(x) => x,
            None => f(),
        }
    }
}

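Here <T> is an ordinary generic type parameter, and <F> is the generic type of the function argument. The where clause requires F to implement FnOnce, take no arguments, and return T.
FnOnce is the least restrictive of the three bounds, so in practice any closure or function that takes no arguments and returns T is accepted.

In short, unwrap_or_else takes the value out of Some, and for None it calls the given function instead: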

fn main() {
    let maybe_number: Option<i32> = None;

    let number = maybe_number.unwrap_or_else(|| {
        println!("no value, use default");
        0
    });

    println!("number {}", number);
}

Output:

no value, use default
number 0

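Next, an FnMut example, where the closure is allowed to modify values and can be called multiple times.
We design a function, consume_and_modify, that modifies the data in a vector using the closure passed in. It has two generic types, T and F. T is the element type of the vector and also constrains the closure, which takes a single parameter, a mutable reference to T, and returns a T.
consume_and_modify calls the closure repeatedly in a for loop, which is why the bound is FnMut rather than FnOnce.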

fn consume_and_modify<T, F>(mut vec: Vec<T>, mut f: F) -> Vec<T>
where
    F: FnMut(&mut T) -> T,
{
    for item in &mut vec {
        *item = f(item)
    }
    vec
}

fn main() {
    let numbers = vec![1, 2, 3, 4, 5];
    let new_numbers = consume_and_modify(numbers, |x: &mut i32| *x * 2);
    println!("{:?}", new_numbers); // [2, 4, 6, 8, 10]
}

Here T is i32 and the closure doubles each i32 value. The original vector is [1, 2, 3, 4, 5], and the new vector after the operation is [2, 4, 6, 8, 10].
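The doubling closure does not modify any captured state, so it would also satisfy a plain Fn bound. As a sketch of a closure that genuinely needs FnMut, the hypothetical running_total helper below captures a mutable sum and updates it on every call. consume_and_modify is repeated so the block stands on its own.

```rust
fn consume_and_modify<T, F>(mut vec: Vec<T>, mut f: F) -> Vec<T>
where
    F: FnMut(&mut T) -> T,
{
    for item in &mut vec {
        *item = f(item);
    }
    vec
}

// Hypothetical helper: replaces each element with the running total so far.
fn running_total(numbers: Vec<i32>) -> (Vec<i32>, i32) {
    let mut sum = 0; // captured by mutable reference
    let totals = consume_and_modify(numbers, |x: &mut i32| {
        sum += *x; // mutating captured state is what requires FnMut
        sum
    });
    (totals, sum)
}

fn main() {
    let (totals, sum) = running_total(vec![1, 2, 3, 4, 5]);
    println!("{:?} {}", totals, sum); // [1, 3, 6, 10, 15] 15
}
```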